-
Notifications
You must be signed in to change notification settings - Fork 1.5k
fix(macros-core): more consistent env loading and reading logic #4018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix(macros-core): more consistent env loading and reading logic #4018
Conversation
a8a9e0c
to
89858d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great start, but needs a bit more work.
sqlx-macros-core/src/query/mod.rs
Outdated
static LOADED_ENV_VARS: Mutex<HashMap<String, String, BuildHasherDefault<DefaultHasher>>> = | ||
Mutex::new(HashMap::with_hasher(BuildHasherDefault::new())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't actually use a single global static
here, because of Rust-Analzyer. RA will load the proc-macro dylib into its process and keep it resident for all compiler invocations, so we can't assume the state is reset between different crates.
This was the cause of #3738 and the solution was to store all relevant context per-crate keyed by CARGO_MANIFEST_DIR
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the same reason, we might actually want to store the modified time of the .env
files we load and check for changes to that. If the file has been modified, we need to re-load it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about our approach to cache invalidation within this macro, and to be honest, I'm not sure I precisely understand what the goals and non-goals here are.
From what I can tell, our code for reacting to environment variable and .env
file changes depends on unstable proc_macro
features gated by procmacro2_semver_exempt
. Since it can be assumed that most people are using a stable Rust toolchain, most people aren't getting any sort of change tracking in the macro either, so why the suggestion to handle .env
files in a somewhat special way by manually checking their modification times? My understanding is that the intention is to rely on proc_macro::tracked_path
for refreshing these, right?
Now, onto the elephant in the room: the static METADATA
cache. While it is keyed by crate manifest path, cache entries are never invalidated, so the way I see it, that means the following sequence of events could happen even without this PR:
- The proc macro runs once for crate A with
procmacro2_semver_exempt
enabled, populatingMETADATA
and instructing the compiler to re-evaluate the macro when environment variables change. - The value of
DATABASE_URL
is modified. - The proc macro runs again for crate A after the compiler/RA notices the change. However, since a
METADATA
entry for crate A already exists, the proc macro library may not be unloaded between executions, and anyset_env
across threads is not necessarily thread-safe, the stale database URL from the cache may get used instead.
That may not be a huge issue in practice, since users always can restart RA, not use RA, or clear build caches if things go wrong. But it does raise the question of this PR's scope. Should we be trying to tackle cache invalidation here, or is that rabbit hole best left out of scope?
While giving the new `sqlx.toml` feature a try, I discovered that its `database_url_var` setting behaved inconsistently compared to other environment variables: it wasn't loaded from `.env` files, causing it to be effectively ignored in favor of the default `DATABASE_URL` variable. This is an inconvenient outcome for the multi-database workspaces that this feature is designed to support. To address this issue, I reworked the `.env` loading logic to account for this configuration and make it simpler to use to achieve consistent behavior. The new algorithm works as follows: - First, it generates lists of environment variables that can be loaded from `.env` files and candidate `.env` file paths. When applicable, the compiler is informed to track changes to their elements. - Next, it loads the set of potentially tracked environment variables using `set_var`, similar to how `dotenvy::dotenv()` operates. When a variable is defined in both the process environment and a `.env` file, the process environment takes precedence, as before. - Macro code can now use the `env` function freely to read environment variable values, abstracting itself away from their source, which results in simpler, less error-prone code. Trivially, this rework resolves the issue I encountered because the `database_url_var` value is now part of the list of loadable environment variables. Future code can easily make such additions as necessary.
2b5e2fd
to
57c1dcf
Compare
I've finally addressed most of the review comments in the latest commits I pushed to this PR, which I also rebased on top of the latest As requested, the static map of environment variables loaded from To preserve the ergonomics of the existing See also PR #4039 for a different implementation take on this problem. |
Context, problem statement and description
While giving the new
sqlx.toml
feature a try, I discovered that itsdatabase_url_var
setting behaved inconsistently compared to other environment variables: it wasn't loaded from.env
files, causing it to be effectively ignored in favor of the defaultDATABASE_URL
variable. This is an inconvenient outcome for the multi-database workspaces that this feature is designed to support.To address this issue, I reworked the
.env
loading logic to account for this configuration and make it simpler to use to achieve consistent behavior. The new algorithm works as follows:.env
files and candidate.env
file paths. When applicable, the compiler is informed to track changes to their elements.set_env
was used, but I changed it as a response to the review comments below). When a variable is defined in both the process environment and a.env
file, the process environment takes precedence, as before.env
function freely to read environment variable values, abstracting itself away from their source, which results in simpler, less error-prone code.Trivially, this rework resolves the issue I encountered because the
database_url_var
value is now part of the list of loadable environment variables. Future code can easily make such additions as necessary.Does your PR solve an issue?
To my knowledge, this PR doesn't directly address any previously reported issue in this repository.
Is this a breaking change?
Technically yes when compared to the released
0.9.0
alpha version, as environment variables likedatabase_url_var
may now be loaded from another source. However, I don't think this technically breaking change will cause significant inconvenience for most users, especially since the new behavior is more consistent and useful, and the affected variables are bound to not be widely used due to them being published as an alpha release so far.